We present and evaluate GPU Bucket Sort, a parallel deterministic sample sortalgorithm for many-core GPUs. Our method is considerably faster than ThrustMerge (Satish et.al., Proc. IPDPS 2009), the best comparison-based sortingalgorithm for GPUs, and it is as fast as the new randomized sample sort forGPUs by Leischner et.al. (to appear in Proc. IPDPS 2010). Our deterministicsample sort has the advantage that bucket sizes are guaranteed and thereforeits running time does not have the input data dependent fluctuations that canoccur for randomized sample sort.
展开▼